In recent years, deep convolutional neural networks (CNNs) have achieved state-of-the-art performance in various computer vision tasks such as classification, detection or segmentation. Due to their outstanding performance, CNNs are increasingly used in the field of document image analysis as well. In this work, we present a CNN architecture that is trained with the recently proposed Pyramidal Histogram of Characters (PHOC) representation. We show empirically that our CNN architecture is able to outperform state-of-the-art results on various word spotting benchmarks while exhibiting short training and test times.